Adding PyGILState_Check() in object_api<>::operator(). #2919

rwgk · 2021-03-26T16:03:30Z

Keywords: callbacks, Python Global Interpreter Lock (GIL)

Problem addressed: GIL not held when C++ calls back into Python. This is a fairly common accident, easily overlooked, and notoriously difficult/time-consuming to diagnose.

This PR adds a safety guard in one strategic location, guarded by NDEBUG.

If NDEBUG is not defined the guard ensures that the Python GIL is held when C++ calls back into Python via object_api<>::operator() (e.g. py::function __call__). The worst-case runtime overhead (for a callback that returns immediately without doing any work) is ~10%. See detailed timings under Adding PyGILState_Check() in object_api<>::operator(). #2919 (comment). The typical runtime overhead (callback doing work) is probably much smaller.
If NDEBUG is defined, the guard is disabled, there is no protection, and zero runtime overhead.

This PR was tested in the Google internal environment, which lead to the discovery of a few GIL-related issues and their cleanup. (The Google internal testing currently covers a couple hundred pybind11 extensions, including many imported or exported OSS packages.)

Suggested changelog entry:

Add a safety guard to ensure that the Python GIL is held when C++ calls back
into Python via ``object_api<>::operator()`` (e.g. ``py::function`` ``__call__``).
(This feature is available for Python 3.6+ only.)

rwgk · 2021-04-01T21:14:48Z

Estimation of worst-case runtime overhead.
Result: 17.261 / 15.510 = 1.113 (ratio of Mean values below).

Compiler: Debian clang version 11.0.1-2

clang++ -o pybind11/tests/test_callbacks.os -c -std=c++17 -fPIC -fvisibility=hidden -Os -flto -Wall -Wextra -Wconversion -Wcast-qual -Wdeprecated -Wnon-virtual-dtor -DNDEBUG -DPYBIND11_STRICT_ASSERTS_CLASS_HOLDER_VS_TYPE_CASTER_MIX -DPYBIND11_TEST_BOOST -Ipybind11/include -I/usr/local/google/home/rwgk/forked/pybind11/include -I/usr/local/google/home/rwgk/clone/pybind11/include -I/usr/include/python3.8 -I/usr/include/eigen3 /usr/local/google/home/rwgk/forked/pybind11/tests/test_callbacks.cpp
clang++ -o lib/pybind11_tests.so -shared -fPIC -Os -flto -shared pybind11/tests/pybind11_tests.os pybind11/tests/test_callbacks.os

Results with PR as-is (except with num_millions = 20, repeats = 10 for better accuracy):

============================= test session starts ==============================
platform linux -- Python 3.8.7, pytest-5.4.3, py-1.9.0, pluggy-0.13.1
rootdir: /usr/local/google/home/rwgk/forked/pybind11/tests, inifile: pytest.ini
plugins: repeat-0.9.1
collected 10 items / 9 deselected / 1 selected

test_callbacks.py 
callback_num_times: 20 million / 1.135 seconds = 17.622 million / second
callback_num_times: 20 million / 1.134 seconds = 17.632 million / second
callback_num_times: 20 million / 1.136 seconds = 17.602 million / second
callback_num_times: 20 million / 1.135 seconds = 17.621 million / second
callback_num_times: 20 million / 1.204 seconds = 16.607 million / second
callback_num_times: 20 million / 1.134 seconds = 17.642 million / second
callback_num_times: 20 million / 1.133 seconds = 17.660 million / second
callback_num_times: 20 million / 1.199 seconds = 16.677 million / second
callback_num_times: 20 million / 1.142 seconds = 17.519 million / second
callback_num_times: 20 million / 1.248 seconds = 16.026 million / second
Min    Mean   Max
16.026 17.261 17.660
.                                                      [100%]

======================= 1 passed, 9 deselected in 11.62s =======================

Results with this diff:

diff --git a/include/pybind11/cast.h b/include/pybind11/cast.h
index 32257074..c55d1007 100644
--- a/include/pybind11/cast.h
+++ b/include/pybind11/cast.h
@@ -1348,11 +1348,11 @@ unpacking_collector<policy> collect_arguments(Args &&...args) {
 template <typename Derived>
 template <return_value_policy policy, typename... Args>
 object object_api<Derived>::operator()(Args &&...args) const {
-#if defined(NDEBUG) && PY_VERSION_HEX >= 0x03060000
+//#if !defined(NDEBUG) && PY_VERSION_HEX >= 0x03060000
     if (!PyGILState_Check()) {
         pybind11_fail("pybind11::object_api<>::operator() PyGILState_Check() failure.");
     }
-#endif
+//#endif
     return detail::collect_arguments<policy>(std::forward<Args>(args)...).call(derived().ptr());
 }

Running tests in directory "/usr/local/google/home/rwgk/forked/pybind11/tests":
============================= test session starts ==============================
platform linux -- Python 3.8.7, pytest-5.4.3, py-1.9.0, pluggy-0.13.1
rootdir: /usr/local/google/home/rwgk/forked/pybind11/tests, inifile: pytest.ini
plugins: repeat-0.9.1
collected 10 items / 9 deselected / 1 selected

test_callbacks.py 
callback_num_times: 20 million / 1.288 seconds = 15.528 million / second
callback_num_times: 20 million / 1.293 seconds = 15.463 million / second
callback_num_times: 20 million / 1.292 seconds = 15.479 million / second
callback_num_times: 20 million / 1.289 seconds = 15.513 million / second
callback_num_times: 20 million / 1.294 seconds = 15.460 million / second
callback_num_times: 20 million / 1.293 seconds = 15.469 million / second
callback_num_times: 20 million / 1.286 seconds = 15.549 million / second
callback_num_times: 20 million / 1.283 seconds = 15.591 million / second
callback_num_times: 20 million / 1.288 seconds = 15.528 million / second
callback_num_times: 20 million / 1.289 seconds = 15.519 million / second
Min    Mean   Max
15.460 15.510 15.591
.                                                      [100%]

======================= 1 passed, 9 deselected in 12.92s =======================

EricCousineau-TRI · 2021-04-02T18:56:00Z

Overall, looks great to me! I think the usage of NDEBUG sounds like an excellent utility to avoid undesired performance loss!

Though was kinda hard for me to initially interpret benchmark results as a frequency in GHz.
Is it still sufficient informative print stats to invert frequency to time for micro-benchmark (e.g. ns per call)?
(this also could be a common perf. benchmark thing that I'm not familiar with)

EricCousineau-TRI

Looks great! Would love to get weigh-in from someone else on this, just to be sure!

tests/test_callbacks.py

Possibly, this explains why PyGILState_Check() cannot safely be used with Python 3.4 and 3.5: python/cpython#10267 (comment)

…results).

henryiii · 2021-04-02T22:54:43Z

Please do not remove the changelog entry section from the template, it gets automatically collected. Plus it is a nice one sentence summary of what you are doing. :) (I think I'll be adding this comment to my auto-replies, or adding a bot to comment this automatically...)

henryiii

Two easy change recommendations from me.

henryiii · 2021-04-02T22:59:31Z

tests/test_callbacks.py

@@ -146,3 +147,33 @@ def test_async_async_callbacks():
    t = Thread(target=test_async_callbacks)
    t.start()
    t.join()
+
+
+def test_callback_num_times(capsys):


Eventually, we could require pytest-benchmark, or something similar. We could even do it manually; add a benchmark mark and if that mark is passed in, run the benchmark tests, and otherwise not run this or run this for a minimal number of times.

Actually, I don't really like a printout that always happens on all test runs. I rather think this should be opt in (via a mark).

FWIW pytest always captures the output, and only reports on error, right?
Hence the recommendation for stuff like --capture=no here: https://github.com/pybind/pybind11/blob/f676782becc5bcd30c56b4da1ff1026db61bcce3/.github/CONTRIBUTING.md#configuration-options

So a defect in some contexts, but prolly not here? (i.e. it's probably good to let it run since it's short?)

I removed capsys entirely. Simply running with -k test_callback_num_times -s is easy and best.

Okay. I think we should probably add a mark for this eventually, but not for just one item yet. @EricCousineau-TRI I believe @rwgk disabled capturing explicitly so that this would print out before, but he's removed that now.

Ah, derp 🤦 Thanks for explaining!

tests/test_callbacks.py

…ing `.format()`.

rwgk · 2021-04-03T00:06:10Z

Overall, looks great to me! I think the usage of NDEBUG sounds like an excellent utility to avoid undesired performance loss!

Though was kinda hard for me to initially interpret benchmark results as a frequency in GHz.
Is it still sufficient informative print stats to invert frequency to time for micro-benchmark (e.g. ns per call)?
(this also could be a common perf. benchmark thing that I'm not familiar with)

I overlooked this comment before, but I will leave this as calls / second, mainly because I don't want to spend the extra time updating the comment with the results, but also because that seems a lot more intuitive to me, feeling like speaking for people who don't do benchmarks all the time (most).

henryiii · 2021-04-03T00:55:08Z

Please remember to add the changelog entry to the issue description! I can copy-and-paste the template back in.

rwgk · 2021-04-03T01:15:21Z

Please remember to add the changelog entry to the issue description! I can copy-and-paste the template back in.

Done.

Thanks @henryiii and @EricCousineau-TRI !

rwgk marked this pull request as ready for review April 2, 2021 18:14

rwgk requested review from EricCousineau-TRI and henryiii April 2, 2021 18:15

EricCousineau-TRI approved these changes Apr 2, 2021

View reviewed changes

tests/test_callbacks.py Outdated Show resolved Hide resolved

rwgk added 6 commits April 2, 2021 12:28

Adding PyGILState_Check() in object_api<>::operator().

64e47cd

Enabling PyGILState_Check() for Python >= 3.6 only.

75cc8bd

Possibly, this explains why PyGILState_Check() cannot safely be used with Python 3.4 and 3.5: python/cpython#10267 (comment)

Adding simple micro benchmark.

155ac1a

Reducing test time to minimum (purely for coverage, not for accurate …

f6b6160

…results).

Fixing silly oversight.

02b8d23

Minor code organization improvement in test.

ef5a014

rwgk force-pushed the master_py_gil_state_check branch from 491d670 to ef5a014 Compare April 2, 2021 19:40

Adding example runtimes.

682c31a

henryiii reviewed Apr 2, 2021

View reviewed changes

Removing capsys (just run with -k test_callback_num_times -s and us…

08e18c5

…ing `.format()`.

henryiii approved these changes Apr 3, 2021

View reviewed changes

rwgk merged commit ad6bf5c into pybind:master Apr 3, 2021

rwgk deleted the master_py_gil_state_check branch April 3, 2021 01:17

github-actions bot added the needs changelog Possibly needs a changelog entry label Apr 3, 2021

henryiii removed the needs changelog Possibly needs a changelog entry label Jul 13, 2021

juso40 added a commit to juso40/pydImgui that referenced this pull request Aug 17, 2021

Remove NDEBUG from release build, pybind/pybind11#2919

a06df53

Infinoid mentioned this pull request Jan 1, 2022

Python test failures with pybind11 v2.8.1 halide/Halide#6524

Closed

rwgk mentioned this pull request Feb 10, 2023

FWD pybind11 google/pybind11clif#2919

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding PyGILState_Check() in object_api<>::operator(). #2919

Adding PyGILState_Check() in object_api<>::operator(). #2919

rwgk commented Mar 26, 2021 •

edited by henryiii

Loading

rwgk commented Apr 1, 2021 •

edited

Loading

EricCousineau-TRI commented Apr 2, 2021

EricCousineau-TRI left a comment

henryiii commented Apr 2, 2021

henryiii left a comment

henryiii Apr 2, 2021 •

edited

Loading

EricCousineau-TRI Apr 2, 2021

rwgk Apr 2, 2021

henryiii Apr 3, 2021

EricCousineau-TRI Apr 3, 2021

rwgk commented Apr 3, 2021

henryiii commented Apr 3, 2021

rwgk commented Apr 3, 2021

Adding PyGILState_Check() in object_api<>::operator(). #2919

Adding PyGILState_Check() in object_api<>::operator(). #2919

Conversation

rwgk commented Mar 26, 2021 • edited by henryiii Loading

Suggested changelog entry:

rwgk commented Apr 1, 2021 • edited Loading

EricCousineau-TRI commented Apr 2, 2021

EricCousineau-TRI left a comment

Choose a reason for hiding this comment

henryiii commented Apr 2, 2021

henryiii left a comment

Choose a reason for hiding this comment

henryiii Apr 2, 2021 • edited Loading

Choose a reason for hiding this comment

EricCousineau-TRI Apr 2, 2021

Choose a reason for hiding this comment

rwgk Apr 2, 2021

Choose a reason for hiding this comment

henryiii Apr 3, 2021

Choose a reason for hiding this comment

EricCousineau-TRI Apr 3, 2021

Choose a reason for hiding this comment

rwgk commented Apr 3, 2021

henryiii commented Apr 3, 2021

rwgk commented Apr 3, 2021

rwgk commented Mar 26, 2021 •

edited by henryiii

Loading

rwgk commented Apr 1, 2021 •

edited

Loading

henryiii Apr 2, 2021 •

edited

Loading